HIERARCHICAL FILE SYSTEM TO PROVIDE
CATALOGING AND RETRIEVAL OF DATA


This is a continuation of application Ser. No. 924,802,
filed Oct. 30, 1986, now abandoned.
BACKGROUND OF THE INVENTION 


1. Field of the Invention 
The present invention relates to the method of storing
and retrieving data using a computer, and more specifically
to a hierarchical filing system.
2. Prior Art
In a computer system, information is typically stored on
mediums, such as magnetic

technology, it became possible for a device to store
much more information than previously.


When information is stored on a device, it is cataloged
so that the same information is later retrieved
when desired. Normally, a unique code name is attributed
to a particular body of data to differentiate it from
others. To retrieve a desired body of data, an appropriate
code name associated with that data is used, wherein
the device searches for that code name and retrieves the
desired data when that code name is found.


cataloging system. When high-density storage devices
are used, millions of bits of information are capable of
being stored on such a device, which permits hundreds,
thousands, and even millions of files to be created. To
search through these files in a serial fashion to look for
a specific file is time-consuming.
It is appreciated that what is needed is a filing system
for a high-density storage medium which rapidly
searches and retrieves the desired file stored. Further,
with the advent of the personal computer (PC) and the
small business computer, where physical size is a concern,
it is desirable to have a filing system which may be


Ra ce eal 


SUMMARY 


A method for providing a hierarchical filing system is
described. The hierarchical filing system provides a
catalog of the data stored in various locations within a
memory device. Typically, one cataloging structure is
used to organize a volume of memory.


The cataloging structure of the hierarchical filing
system is provided by an upside-down tree type structure
wherein there is a starting directory which operates
as a root directory. Other directories and files emanate
as off-spring. A plurality of descendant levels
branch downward to provide the hierarchical structure
of the catalog. The cataloging structure contains the
location information of where the actual data is stored.

The file cataloging system is implemented using a
B-Tree. The cataloging information is kept in the leaf
nodes of the B-Tree. The non-leaf nodes (index nodes)
of the B-Tree contain information that allows searching
for particular catalog information by using the code
name or key of the corresponding file. Key values,
which are used to identify and catalog various files in
the cataloging system, are also used to organize the
catalog in the leaf nodes of the B-Tree. The keys are
placed in an ascending order for systematic access.
Further, the B-Tree grows by using left rotates and left
splits with insertion of catalog information about new
files from the right to maintain a balanced tree.

When a file's data is stored, additions, deletions and 
modifications will typically result in non-contiguous 
physical storage of the data in the memory device. Each 
of the contiguous segments of the file is known as a file 
extent. A record of the physical location of the extents 
for a particular file is maintained in one or more extents 
records. The hierarchical filing system uses a file extents 
list to maintain the extents records of the various files on 
the memory device. 

The present invention maintains the first extents re- 
cord of a file in the cataloging structure, but any further 
extents records are maintained in a separate file extents 
list. This file extents list is also implemented in a second 
B-Tree structure. 


BRIEF DESCRIPTION OF THE DRAWINGS 


FIG. 1 is a representation of a prior art flat filing 
system. 

FIG. 2 is a representation of a hierarchical filing 
system of the present invention. 

FIG. 3 is a representation of a B-Tree structure of the 


FIG. 5 is a representation of a left-split and a left-rotate
operation of a B-Tree structure of the preferred
embodiment.

FIG. 6 is a representation of a cataloging structure of 
the preferred embodiment and an organization of the 
cataloging structure in various nodes of a B-Tree. 

FIG. 7 is a representation of a volume allocation 
mapping in a filing system of the preferred embodiment. 

FIG. 8 is a representation of a file extents list of the
preferred embodiment and showing various file extents
in memory.

FIG. 9 is a representation showing the file extents 
organization in the Catalog and Extents B-Trees of the 
preferred embodiment. 


DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENTS 


The present invention describes a method of storing
and retrieving information using a hierarchical filing
system. In the following description, numerous specific
details are set forth in order to provide a thorough
understanding of the present invention. It will be obvious,
however, to one skilled in the art that the present
invention may be practiced without these specific details.
In other instances, well-known methods have not
been described in detail in order not to unnecessarily
obscure the present invention.

Referring to FIG. 1, a prior art flat filing system 10 is 
shown having a directory 11 and files 12-15. For ease of 
understanding, a directory is shown pictorially as a 
folder and a file is shown as a sheet of paper with a 
folded corner. The pictorial representation applies well 
to an analogy of placing papers into folders (files into 
directories). In the prior art system 10, there is present 
a single directory 11, which contains locator information
for files 12-15. Each of the files 12-15 contains data
which would be associated with a specific body of
stored information. In this particular example of a prior
art system 10, to access file 15, a serial search is made 
through directory 11, until the file address of file 15 is 
located, such sequential search resulting in considerable 
lapse of time when substantial numbers of files exist in 
the directory 11. Although in this hypothetical exam- 
ple, directory 11 maintains pointer addresses to four 
files 12-15, directory 11 will continue to store addresses 
of subsequent files in a sequential fashion. 

FIG. 2 illustrates the architecture of the Hierarchical
Filing System (HFS) of the present invention. This
particular HFS 16 includes a root directory 17 and files
21-24. The HFS 16 also includes directories 18-20.
Each directory is capable of containing files, as well as
other directories such as directory 18 containing directory
20. Each directory is a branching node, allowing
for none or a plurality of sub-branching nodes. Each
directory contains information which permits the
branching to occur. The actual data is stored in the files
21-24. Because each file is a termination node, it does
not need to maintain further branching information.
Instead, each file stores the actual data. Therefore, the
directories 17-20 maintain branching information,
while files 21-24 contain the stored data.

HFS 16 accesses files 21-24 in a hierarchical fashion
so that serial search for the files is not necessary. Assume
in the example of FIG. 2 that access to data stored
in file 23 is desired. A search of directory 17 reveals that
two possible paths exist in seeking the address of file 23.
One path from directory 17 leads to directory 18 and
the other path leads to directory 19. The desirable path
is to directory 18, at which point there are again two
paths. The desirable path from directory 18 leads directly
to file 23. Although this example is simplistic
because of the minuscule number of files shown, one can
appreciate the file search time saved when a substantially
large number of files are present.

Further, as an example, if file 22 had been chosen, the
path from directory 18 would have led to directory 20,
at which point two paths exist from directory 20. The
desirable path to file 22 from directory 20 then would
have been chosen. HFS 16, although shown in a particular
form in FIG. 2, may have any number of levels
(branchings) down from the root directory 17 as well as
any number of branches from a particular directory.
However, it is to be noted that all data is stored in the
represented files 21-24 which are all located at the
termination nodes of HFS 16.

In actuality, the cataloging architecture of the preferred
embodiment contains cataloging locator description
information in the HFS 16 structure. The catalog
entries for files 21-24 contain pointers which provide
locator descriptions to locate places in storage area
where actual stored data is kept.


B-TREE 


The HFS of the present invention is implemented 
using two B-Tree structures in the preferred embodi- 
ment, the Catalog B-Tree and the File Extents B-Tree. 
A B-Tree structure is well-known in the prior art and is 
described in The Art of Computer Programming Volume 
3 (Sorting and Searching); by Donald E. Knuth; at 
Section 6.4; titled “Multiway Trees”; pp 471-479
(1973). The nodes of a B-Tree contain records, wherein 
each record is comprised of certain information, either 
pointers or data, and a key associated with that record. 

Referring to FIG. 3, a hypothetical B-Tree is illus- 
trated. A basic feature of the B-Tree 31 is that data is 
Referring to FIG. 4, it shows the structure of any of


nodes as a way of quickly moving through the records 
of the various nodes at a given level. For each node, 
NDBLINK 52 contains a pointer to the previous node,


and NDFLINK 51 contains a pointer to the subsequent
node at the same level. In FIG. 3, NDBLINK for node
36 would point to node 35 and NDFLINK for node 36
would point to node 37. Therefore, NDBLINK 52 and
NDFLINK 51 are means of locating adjacent nodes
without first reversing back up the B-Tree.

The records segment 44 contains the B-Tree’s re- 
cords, each with its key and pointer or data information. 
In this particular example, there are two records 60 and 
61. The records in a node can be of variable length. For 
this reason, offsets to the beginning of each record are 
needed. The records segment begins immediately fol- 
lowing the node descriptor segment 43. The records are 
followed by a free space segment 45, which is basically 
the unused space of the node. Therefore, free space 
segment may not exist in some instances. The record 
offset segment 46 at the end of the node contains the 
offset information for records 60 and 61. Offset 68 contains
offset information for record 60 and offset 67 contains
offset information for record 61. Offset 66 contains
the offset necessary to determine free space 62. Thus the
record segment 44 builds downward into the free space 
segment 45, while the record offset segment 46 builds 
upward into the free space segment 45 from the oppo- 
site end. 
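
The node layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the on-disk format of the preferred embodiment: the 512-byte node size, the 14-byte descriptor size, the 2-byte big-endian offsets, and the names `Node`, `add_record` and `record` are all hypothetical.

```python
import struct

NODE_SIZE = 512        # assumed node size in bytes
DESCRIPTOR_SIZE = 14   # assumed size of the node descriptor segment


class Node:
    """Records grow forward from the descriptor; offsets grow back from the end."""

    def __init__(self):
        self.buf = bytearray(NODE_SIZE)
        self.count = 0
        # The newest offset slot always marks where free space begins.
        self._set_offset(0, DESCRIPTOR_SIZE)

    def _slot(self, i):
        # Offset i lives in the i-th 2-byte slot counted from the node's end.
        return NODE_SIZE - 2 * (i + 1)

    def _set_offset(self, i, value):
        struct.pack_into(">H", self.buf, self._slot(i), value)

    def _offset(self, i):
        return struct.unpack_from(">H", self.buf, self._slot(i))[0]

    def free_space(self):
        # Gap between the end of the records and the start of the offset table.
        return self._slot(self.count) - self._offset(self.count)

    def add_record(self, data):
        # Room is needed for the record plus one more 2-byte offset slot.
        if len(data) + 2 > self.free_space():
            return False
        start = self._offset(self.count)
        self.buf[start:start + len(data)] = data
        self.count += 1
        self._set_offset(self.count, start + len(data))
        return True

    def record(self, i):
        # A variable-length record is bounded by two consecutive offsets.
        return bytes(self.buf[self._offset(i):self._offset(i + 1)])
```

With two records stored, the sketch holds three offsets, mirroring offsets 68, 67 and 66 in the description: one per record plus one marking the start of free space.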

If node 42 is an index node, then each record 60 and 
61 is comprised of a key and pointer information. Further,
NDFLINK 51 and NDBLINK 52 would contain
adjacent index node linking pointers. If node 42 is a leaf 
node, then each record 60 and 61 is comprised of a key 
and data information. NDFLINK 51 and NDBLINK 
52 would also contain leaf node linking pointers. It is 
also appreciated that although a particular format is 
illustrated for node 42, the format may be modified
readily to include other types of information. Also, in


the preferred embodiment data information in the leaf 

nodes of the HFS catalog B-Tree is used to address 

locations in memory where the actual data is stored. 
Referring to FIG. 5, a specialized B-Tree expansion
architecture as implemented in the preferred embodiment
is shown. A node 70, which is equivalent to node
42 of FIG. 4, is shown having pointers to two lower- 
level nodes 71 and 73, which may be index or leaf 
nodes. Although only two nodes 71 and 73 are shown at 
the lower level, any number of nodes may reside at this 
lower level. Also in this particular hypothetical exam- 
ple, nodes 71 and 73 are only partially filled. 

For a B-Tree to maintain its balance, records must be
kept uniformly spaced within the hierarchical structure.
An unbalanced tree will result when records are not
maintained uniformly in each node or nodes are heavily
stacked toward one branch of the B-Tree. The preferred
embodiment uses a technique of left rotates and
left splits to provide movement of records from one
node to another to maintain a balanced tree. When
records are to be transferred to another node, the left
rotate operation is used. In this instance, records in node
73 are left rotated to its left adjacent node 71, as shown
by arrow 77.

If another node is needed, such as when records in 
node 73 must be rotated and node 71 cannot accommo- 
date records from node 73, a left split operation is used 
to insert node 72 to the left of node 73, between nodes 
71 and 73. In this instance, node 72 is inserted to link 
node 71 and node 73, as shown by arrows 78. When 
node 72 is inserted, appropriate pointer links will be 
established with its index node 70 as well as adjacent 
link pointers for nodes 71 and 73. Continually moving
data leftward and inserting new data at the right extremities
helps keep the B-Tree balanced. Because the
HFS of the present invention is structured to have the 
ascending nodes organized in a rightward direction, the 
balancing is maintained even though the rotates and 
splits are made toward the left direction. It is appreci- 
ated that right splits and rotate operations, or balanced 
insertions using both right and left operations can be 
used as well. Although the preferred embodiment uses 
and attempts to maintain a balanced B-Tree for search 
efficiency, most any B-Tree structure can be used,
including an unbalanced B-Tree.
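
The left rotate and left split operations can be illustrated with a deliberately simplified sketch that works on one level of leaf nodes only. It is not the procedure of the preferred embodiment: index node updates and link pointers are omitted, the node capacity of four records is arbitrary, and rotating a single record at a time is an assumption made to keep the example short.

```python
import bisect

CAPACITY = 4  # assumed maximum number of records per node


def insert(leaves, key):
    """Insert key into a left-to-right list of sorted leaf nodes."""
    # Pick the first leaf whose largest key is >= key, else the rightmost leaf.
    i = next((n for n, leaf in enumerate(leaves) if leaf and key <= leaf[-1]),
             len(leaves) - 1)
    bisect.insort(leaves[i], key)
    if len(leaves[i]) <= CAPACITY:
        return
    if i > 0 and len(leaves[i - 1]) < CAPACITY:
        # Left rotate: move the smallest record into the left neighbour.
        leaves[i - 1].append(leaves[i].pop(0))
    else:
        # Left split: insert a new node to the left holding the lower half.
        half = len(leaves[i]) // 2
        leaves.insert(i, leaves[i][:half])
        del leaves[i + 1][:half]
```

Starting from `leaves = [[]]` and inserting the keys 1 through 10 in ascending order, new records always arrive at the right extremity while existing records drift leftward, which is the behavior the description attributes to the left rotate and left split technique.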


CATALOG TREE 


Referring to FIG. 6, a hypothetical catalog 90 is used
to illustrate the implementation of cataloging of the
preferred embodiment. The structure 90 has a root directory
91 named “Volume”. Each directory of the
preferred embodiment is assigned a unique numerical
identifier known as the directory identifier (DirID).
The root directory of catalog 90 has DirID value of 2.
Root directory 91 has three branches comprised of
directory 92 and files 93 and 94. Directory 92 has a
name of “Folder” and a DirID value of 29. In turn,
directory 92 has two branches comprised of files 95 and
96. Files 93-96 are named “A”, “B”, “C” and “D”,
respectively in this example. The architecture of the
directories and files follows the HFS structure as previously
explained in FIG. 2. The complete cataloging
structure 90 is stored as data records in various leaf
nodes of the B-Tree of FIGS. 3 and 4 known as the
catalog B-Tree. It is appreciated that the cataloging
structure 90, although a tree, is in itself not a B-Tree.
The form of structure 90 is actually stored in the various
leaf nodes of a B-Tree. It is to be appreciated that the
cataloging structure 90 should not be confused with the
previous description of the B-Tree. Catalog 90 and the
B-Tree structure are two separate and distinct structures.
The hierarchical structure of the catalog 90 is implemented
as a B-Tree structure and stored as data records
in leaf nodes of a B-Tree similar to that of FIGS. 3 and
4.


The hierarchical catalog structure 90 is stored in a 
storage device as shown by a memory map 97 of FIG. 
6. Cataloging map 97 is comprised of three possible 
types of records: directory records 100, file records 101, 
and thread records 102. Each record 100-102 is com- 
prised of a key 103 and information segment 104, as 
earlier described in the description of a leaf node of a 
B-Tree. The key 103 of each record is comprised of a 
value 105 and a name 106. The key 103 of a directory 
record, such as that of 91 and 92, is comprised of its
directory name 106 and its parent directory’s DirID
value 105. An information segment 104 of each directory
record, such as that of directories 91 and 92, is comprised
of the directory’s DirID value 107. For directory
92, the directory’s DirID has been given the value of 29,
and has a name of “Folder”. The parent DirID of record
92 has been given the value 2 because directory 92
is an offspring of directory 91 in the structure 90. Directory
record 91 has a directory DirID value of 2, with a
corresponding name of “Volume”. Because directory
91 is a root directory, the parent DirID value has been
given the value of 1, wherein the value 1 refers to the
foundation of the filing system itself.

A file record, such as file records 93-96, is also comprised
of a key 113 and an information segment 114,
wherein key 113 is also comprised of a parent DirID
value and a name. However, in the information segment
114, the descriptive location information for the actual 
stored file data is maintained as well as a unique file 
number. The information segments 114 of file records 
93-96 contain the descriptive location of the actual 
stored data information. 

File record 94, having a file name of B, and file record
93, having a file name A, both have a parent DirID
value of 2. The parent DirID value of 2 signifies that
files A and B are direct offsprings of directory “Volume”
having a DirID value of 2. File 95, having a name
C, and file 96, having a name D, have parent DirID
values of 29, which reflect the origination of files C and
D as offsprings of directory 92 labeled “Folder”, having
a DirID value of 29. Therefore, by looking at any file or
a directory record’s key 103, the stored information
provides the identification of the name of that particular
record as well as the DirID value of the parent node.

To provide the interconnection of the different
branches, a thread record 102 is provided for each directory.
The key of a thread record contains a DirID
value and a null-name, which is equivalent to having no
name at all. In the example of FIG. 6, thread record 108
provides the connection between the directory
“Folder” and files C and D. In the key 111 of thread
record 108, only the directory DirID value of “Folder”
is given. In the information segment 112 of thread record
108, the DirID of “Folder”’s parent and the directory’s
name “Folder” are given. Therefore, when file C,
having a parent DirID 29 attempts to link to its immediate
parent directory 92, which has a DirID of 29, the
thread record 108 provides the name (Folder) of the
parent directory 92, as well as the parent DirID value of
directory 92, which is equal to 2.
Equivalently, thread record 109 provides the name
(Volume) of directory 91 as well as its parent directory
DirID value for the three offsprings 92-94 of directory
91. By having directory records 91-92, file records
93-96, along with thread records 108-109 for each directory,
the cataloging structure 90 is interconnected
into a HFS, wherein the descriptive location information
for the actual stored data is stored in file records
93-96 as shown in the structure 97 of FIG. 6.

By implementing the cataloging structure 90 using a
B-Tree structure, the hierarchical
structure 90 is easily stored in the leaf nodes of a B-Tree
of the earlier description. For example, when file C is to
be accessed by a computer, the system will implement a
B-Tree search. Referring to the catalog example 90 of
FIG. 6, when the file with name C is to be found, the search
path must be specified for this search. This can be given
in terms of a sequence of the names of all directories on
the path from the root to the said file, thus “Volume”,
followed by “Folder”, and finally “C”. The search
begins by finding the directory record in the Catalog
B-Tree that corresponds to “Volume”. Its name is
“Volume” and since it is the root, its parent DirID
value is 1. The catalog B-Tree is searched for a directory
record with key <1> Volume; thus, directory
record 91 is found. Its information segment then provides
the DirID value 2 of this directory. Now a search
is made through the B-Tree for the record with key
<2> Folder which leads to the directory record 92,
whose information segment provides this directory's
DirID value of 29. Thus now a search of the B-Tree is
made to find the data record with key <29> C. This
immediately leads the search to the file record 95,
whose information segment contains the information
about the physical location of the data contained in the
desired file.
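
The search just described can be traced with a small sketch. A Python dictionary stands in for the Catalog B-Tree here, which is an assumption made for brevity; the record keys and DirID values follow the FIG. 6 example, while the file locator strings and the function names `lookup` and `full_path` are hypothetical.

```python
# Catalog records keyed by (parent DirID, name), as in the FIG. 6 example.
# The locator strings for files are placeholders, not real extent data.
CATALOG = {
    (1, "Volume"): ("dir", 2),
    (2, "A"): ("file", "location of A"),
    (2, "B"): ("file", "location of B"),
    (2, "Folder"): ("dir", 29),
    (29, "C"): ("file", "location of C"),
    (29, "D"): ("file", "location of D"),
    # Thread records: key is (DirID, null name); data is (parent DirID, name).
    (2, ""): ("thread", (1, "Volume")),
    (29, ""): ("thread", (2, "Folder")),
}

ROOT_PARENT = 1  # parent DirID of the root directory


def lookup(path, start_dir_id=ROOT_PARENT):
    """Follow a sequence of names, one keyed search per path component."""
    dir_id = start_dir_id
    record = None
    for name in path:
        record = CATALOG[(dir_id, name)]
        kind, info = record
        if kind == "dir":
            dir_id = info  # the directory's own DirID keys the next search
    return record


def full_path(dir_id):
    """Use thread records to walk from a directory back up to the root."""
    names = []
    while dir_id != ROOT_PARENT:
        _, (parent, name) = CATALOG[(dir_id, "")]
        names.append(name)
        dir_id = parent
    return names[::-1]
```

Calling `lookup(["Volume", "Folder", "C"])` performs the three keyed searches <1> Volume, <2> Folder and <29> C in turn, and `lookup(["C"], start_dir_id=29)` shows a search that starts from a known DirID rather than from the root.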

It will be appreciated that the specification of the file 
of the above example could start with the DirID value
of any directory on the path from the root to the desired 
file, and would then consist of this DirID value and the 
sequence of names of the directories on the balance of 
the path from that directory to the desired file. The 
search mechanism followed is an obvious variant of the 
one indicated above. 

Although cataloging structure 90 is a simplified struc- 
ture and FIG. 6 only shows the presence of a single 
structure having a single root directory 91, a cataloging 
structure may be enlarged manyfold. The preferred 
embodiment uses one HFS cataloging structure per 
memory device, such as a disk. However, such a disk 
can be partitioned and an HFS catalog assigned to each 
such partition. 

The catalog records of structure 97 of FIG. 6 are 
stored as the data records in the leaf nodes 42 of FIG. 4 
of a catalog B-Tree. These records are inserted and
maintained in the catalog B-Tree in ascending alphanu- 
meric order. Thus, if the leaf nodes of the B-Tree are 
traversed from left to right, the data records will be 
encountered in the order shown in structure 97 of FIG. 
6. This order maintains the records in ascending order 


the name
as 


cords include such items as flags for locking 
values to set logical and physical end of files, and size of 
the file. 


FILE EXTENTS TREE 


memory 

The memory device is considered to be a sequentially 
numbered collection of blocks. A series of contiguous 
memory blocks is called an extent. Ideally, a file would 


case of preallocated or small files, the contents of a 

file are usually stored in more than one ex- 
tent, separated into non-contiguous sections on a vol- 
ume. Each file extent can be identified by an extent 
descriptor. Thus, the complete location information of a 

file is a sequential extents list consisting of the
extent descriptors of the various extents containing the
file's data.

The file extents list of the present invention is orga- 
nized also as a B-Tree, known as the File Extents B- 
Tree, and records the volume location and size of the 
various extents that comprise the files. Although most 
any memory allocation system can employ the file ex- 
tents record of the present invention, a specific memory 
allocation system is described to illustrate the file ex- 
tents record of the preferred embodiment. 
Referring to FIG. 7, a memory volume 120 which is
a portion of a memory device, such as a hard disk, is 
shown. Volume 120 is segmented into a number of logi- 
cal blocks 126. Typically, each logical block 126 is 
comprised of a predetermined fixed number of bytes, 
such as 512 bytes for the preferred embodiment. A fixed 
number of logical blocks starting at block 0 and ending 
at block n is reserved for volume information. The 
balance of the memory device starting at block n+1 is 
available for data storage and this storage area is sepa- 
rated into allocation units, wherein each allocation unit 
is comprised of one or more contiguous logical blocks.

Volume 120 includes four areas 121-124. System
start-up area 121 contains certain configurable system
parameters which are well-known in operating a disk or
other memory devices. Volume information area 122
contains information regarding the housekeeping parameters
of the volume, such as number and size of each
allocation unit. Volume bit map 123 maintains a record of
each allocation unit on the volume 120 and uses a bit 
map to designate use or non-use of each allocation unit. 

Commencing at block n+1, a file content area 124 
extends to the end of the Volume 120. File content area 
124 is separated into a number of allocation units, 
wherein each allocation unit is comprised of a fixed 
number of logical blocks. While the bit map 123 main- 
tains volume space management, it does not provide file 
mapping. The file mapping function is provided by the 
file extents lists. 
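
The division of labor between the bit map and the extents lists can be sketched as follows. This is a minimal illustration, not the allocator of the preferred embodiment: a list of 0/1 flags stands in for the packed bit map, the first-fit policy is an assumption, and the name `allocate` is hypothetical.

```python
def allocate(bitmap, units_needed):
    """Mark free allocation units as used; return extents as (start, count)."""
    if bitmap.count(0) < units_needed:
        raise ValueError("not enough free allocation units")
    extents = []
    for unit, used in enumerate(bitmap):
        if units_needed == 0:
            break
        if used:
            continue
        bitmap[unit] = 1  # the bit map only records use or non-use
        units_needed -= 1
        if extents and extents[-1][0] + extents[-1][1] == unit:
            # Unit is adjacent to the previous extent: extend that extent.
            start, count = extents[-1]
            extents[-1] = (start, count + 1)
        else:
            # A gap was skipped, so a new non-contiguous extent begins.
            extents.append((unit, 1))
    return extents
```

The bit map records only which units are taken; the returned list of (start, count) pairs is what a file would keep in its extents list to map its own data.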

Referring also to FIG. 8, a portion of file contents 
area 124 is shown containing information attributed to a 
file labeled file E. In this hypothetical example the entire
contents of file E are separated into seven extents
125-131. The first portion of the file is stored in base
extent 125, the subsequent portions of the file are distributed
accordingly in extents 2-7 which are labelled
126-131. File E has seven extents 125-131 which are
not physically contiguous. To maintain file extents in- 
formation an extent descriptor 140 is used for the base 
extent 125 and each of the subsequent extents 126-131 
of file E. 

Extent descriptor 140 is comprised of a starting allo- 
cation unit number 141 and number of allocation units 
142. File E extents list 138, which is comprised of seven 
extent descriptors 125a-131a, provides information as
to the address and length of each extent 125-131 of file
E. For example, the fourth extent 128, which has a
starting allocation address of 189 and is only two allocation
blocks long, has a value of 189 in field 141 and a
value of 2 in field 142 of descriptor 128a.

Extents descriptors of all files in a volume are maintained
in the present invention in the data records contained
in the leaf nodes of a B-Tree such as of FIGS. 3-5.
This tree is known as the File Extents B-Tree and is a 
separate B-Tree from the earlier described catalog B- 
Tree. Each data record of this extents B-Tree consists of 
a key and an information segment as before in the dis- 
cussion of FIGS. 3-5. The information segment of a File 
Extents B-Tree data record is comprised of a sequence 
of extents descriptors of a particular file. The maximum 
number of extents descriptors in such a record can vary 
from implementation to implementation, but in the preferred
embodiment is set to three. The key of the File
Extents B-Tree record consists of two fields: the file
number of the particular file and the file relative position
of the starting block of the first extent descriptor in
that record. These extents records are kept in the leaf
nodes of the Extents B-Tree sorted in ascending order
first on the file number field and then on the file relative
position of the starting block. This allows efficient
search through the B-Tree for the location information
of data at a particular file relative position.

In actuality, the preferred embodiment stores three
extents descriptors, base plus two subsequent extents
descriptors, in the information data segment 114 of the
file’s catalog B-Tree record such as 94 of FIG. 6. Therefore,
in the example of FIG. 8, extent descriptors 125a,
126a and 127a are kept in the information segment of
the cataloging structure and extents 128a-131a are kept
in the File Extents B-Tree as shown in FIG. 9. Permitting
limited extent information to be kept in the data
segments of a cataloging structure permits faster access
to data. Only when a file contains four extents or more,
will it need to consult the File Extents B-Tree. It should
be appreciated that the number of extents which are 
kept in the file's Catalog B-Tree record without using a 
File Extents B-Tree is arbitrary and can be changed 
without departing from the spirit and scope of the in- 
vention. 

FIG. 9 shows a catalog file record
145 and File Extents B-Tree records 143 and 144.
As explained in the structure of B-Trees of the present
invention, each record 143 and 144 is comprised of a
key 148 and 149 and extents list 146 and 147, respectively.
To locate a certain portion of the data of a particular
file, first the Catalog B-Tree is searched for the
corresponding file record. From this file record's information
segment, the file number is extracted. Also, the
first three extent descriptors in the information segment
of the catalog B-Tree file record are examined. If the
required file data is contained within the corresponding
extents, then the location information is now readily
available. If, however, the desired file data is located in
extents beyond the three in the catalog's file record,
then a search is made of the File Extents B-Tree using as
a search key the file number and the computed file
relative block position of the desired data. This search
will lead to the File Extents B-Tree record containing
the desired location information.

In the example, file E is comprised of 22 blocks
and has an arbitrary file number equal to 20. The
extent descriptors contained in the catalog file record
145 for file E provide the location information for the
first 3 extents which in turn comprise the first 9 blocks
(3+3+1) of the file. The location information for the
remaining 13 blocks (2+3+1+7) of the file is contained
in two data records 143 and 144 within the File
Extents B-Tree. Assume that the desired data is at file
relative block position 13 within file E. The extent descriptors
contained in the file's catalog record are examined
first. Since relative block 13 is greater than the
number of blocks located by the extent descriptors in
the file's catalog record, the File Extents B-Tree is
searched. The key used for the B-Tree search for relative
block position 13 is <20,13>.

Since the key value of “13” is greater than the value
“9” of key 148 for the first File Extents B-Tree record
143 for file E and is less than the value “15” of key 149
for the second record 144, the search ends with a “not
found” result but positions to the second B-Tree record
144. By retrieving the previous record 143 of key 148,
the extent descriptor for relative block 13 is obtained.
The value of “9” for key 148 is derived because extents
list 146 starts at the tenth relative block (allocation unit
number 9). The value of “15” for key 149 is derived
because extents list 147 starts at the sixteenth relative
block (allocation unit number 15).
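
The two-stage lookup for file E can be followed in a short sketch. A sorted Python list searched with `bisect` stands in for the File Extents B-Tree, which is an assumption. Only the file number 20, the keys <20,9> and <20,15>, the fourth extent (start 189, length 2) and the extent lengths 2, 3, 1 and 7 come from the description; every other starting address is made up, the lengths 5, 3 and 1 for the first three extents are assumed so that they total the nine blocks stated, and one block is treated as one allocation unit.

```python
import bisect

# Extent descriptors are (starting allocation unit, number of units).
# Start addresses other than 189 and the 5/3/1 split are illustrative only.
CATALOG_EXTENTS = {20: [(100, 5), (120, 3), (150, 1)]}

# File Extents B-Tree records as a sorted list of (key, extents),
# where key = (file number, file-relative starting block).
EXTENTS_TREE = [
    ((20, 9), [(189, 2), (200, 3), (230, 1)]),
    ((20, 15), [(300, 7)]),
]


def find_in(extents, offset):
    """Return the allocation unit `offset` units into `extents`, or None."""
    for start, count in extents:
        if offset < count:
            return start + offset
        offset -= count
    return None


def locate(file_number, block):
    # First consult the three descriptors kept in the catalog file record.
    unit = find_in(CATALOG_EXTENTS[file_number], block)
    if unit is not None:
        return unit
    keys = [key for key, _ in EXTENTS_TREE]
    # Position just past the last record whose key is <= the search key,
    # then step back one record, as in the <20,13> example.
    i = bisect.bisect_right(keys, (file_number, block)) - 1
    if i < 0 or keys[i][0] != file_number:
        return None
    (_, first_block), extents = EXTENTS_TREE[i]
    return find_in(extents, block - first_block)
```

For relative block 13, the catalog descriptors cover only blocks 0 through 8, so the search key <20,13> falls between the two tree records and the earlier record with key <20,9> is used; block 13 is then four units into that record's extents list.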


IMPLEMENTATION 


The HFS of the present invention is implemented in a
computer which is coupled to a memory device, such as
a disk, having an ability of storing millions of bits of
information, although any storage medium can use the
HFS. Typically, the HFS of the present invention provides
the cataloging of various groupings of data, such
as files, which are stored on the disk. 

The preferred embodiment implements data storage
by the use of a cataloging structure previously de-
scribed to catalog data stored on a large capacity mem-
ory device. It also maintains a file extents record of up
to three extents per file in the catalog. Subsequent ex-
tent information is stored in a separate file extents re-
cord. Both the catalog record and the extents record are
maintained using two B-Trees of the earlier described
B-Tree structure.
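
As a rough illustration of how a file's extents might be divided between the catalog record and the separate file extents records, the following Python sketch keeps the first three extents for the catalog and groups the rest into overflow records keyed by file number and first relative block. The function name split_extents is hypothetical, the limit of three extents per overflow record is an assumption, and the allocation block numbers and the sizes of the first three extents are placeholders chosen only so that they total the 9 blocks of the file E example.

```python
def split_extents(file_number, extents, per_record=3):
    """Split (start, count) extents into catalog and overflow records.

    The first `per_record` extents stay in the catalog record. Each
    overflow record is keyed by (file number, first relative block it
    covers), which is the key form used for the File Extents B-Tree.
    """
    catalog_extents = extents[:per_record]
    next_block = sum(count for _, count in catalog_extents)
    overflow = []
    for i in range(per_record, len(extents), per_record):
        group = extents[i:i + per_record]
        overflow.append(((file_number, next_block), group))
        next_block += sum(count for _, count in group)
    return catalog_extents, overflow


# Placeholder layout for a 22-block file with file number 20.
file_e = [(10, 5), (30, 3), (50, 1), (100, 2), (200, 3), (300, 1), (400, 7)]
catalog, overflow = split_extents(20, file_e)
for key, group in overflow:
    print(key, group)
```

With this layout the two overflow records receive the keys <20,9> and <20,15>, matching the key values of the two File Extents B-Tree records in the file E example.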

The HFS as described in the preferred embodiment is
controlled by a combination of hardware and software
in a computer system. The HFS controlling routines are
stored in a storage device separate from the device used
for storing the actual data. The preferred embodiment
stores the routines in a read only memory (ROM), al-
though almost any storage medium may be used.

Thus, a hierarchical filing system for use with a large
capacity memory device is described.

We claim: 

1. In a computer, a hierarchical filing system to pro-
vide cataloging and retrieval of data stored on a storage
device, said hierarchical filing system comprising:


a memory for storing a program for said cataloging
and retrieval;

a processor coupled to said memory and said storage
device for processing an organizing means to cata-
[illegible];

said program for organizing said data on said storage
device into a hypothetical catalog which has a root
directory, a plurality of branching directories ar-
ranged at various subsequent levels from said root
directory, wherein some of said branching directo-
ries branch from other of said branch directories;
said branching directories being interconnected
such that for each of said branching directories
there is only a singular path from itself to said root
directory; and wherein some of said branching
directories have at least one file, each file corre-
sponding to a representation of a predetermined
portion of said stored data;

an assigning means for assigning a unique identifica-
tion value to said root directory and each of said
branching directories, and assigning an identifica-
tion name to each of said files, root directory and
branching directories, wherein each of said branch-
ing directories and files are each provided with a
key comprised of its identification name and its
next higher level directory identification value;

a list forming means for forming a linear list of files
and directory entries such that said file and direc-
tory entries are ordered by said keys, such that said
root directory being the highest level and files
being the lowest level; and said interconnection of
each of said singular path is provided by each file
and branching directory identification name being
associated with directory identification value of its
next higher level;

a structure forming means for forming a B-Tree in- 
dexing structure having a beginning node, a plural- 
ity of indexing nodes and a plurality of terminating 
nodes, and wherein said linear list is stored in said 
terminating nodes of said B-Tree indexing struc- 
ture. 

2. The hierarchical filing system defined in claim 1,
wherein said memory for storing said program is a read
only memory.

3. In a computer system where data is to be cata-
logued when stored into a memory device, a method
performed by the computer system for providing a
hierarchical filing system to catalogue said data into a
volume of said memory device for subsequent retrieval,
comprising the steps of:
creating a root directory, a plurality of subdirectories 


wherein each of said subdirectories and files are each 
provided with a key comprised of its identification 


being 

identification value of its next higher level; 

forming a B-Tree indexing structure having a begin- 
ning node, a plurality of indexing nodes, and a 
plurality of terminal nodes; 

storing said linear list in said terminal nodes of said 
B-Tree structure in alphanumerical order accord- 
ing to said numerical directory value; 

assigning said identification name of a given file to a
respective portion of said data;


data stored in said memory device. 
4. The method as described in claim 3 wherein said
step of forming said B-Tree indexing structure further
comprises the step of forming a B-Tree structure
wherein said beginning node comprises a root node of
said B-Tree, said indexing nodes comprise branch nodes
of said B-Tree, and said terminal nodes comprise leaf
nodes of said B-Tree.

5. The method as described in claim 4 wherein said
step of placing location information in said files com-
prises the step of providing a plurality of extent point-
ers, each extent pointer pointing to a location of a por-
tion of said data stored in said memory device corre-
sponding to said given file such that non-contiguous
data segments are made to correspond to each said file.

6. The method as described in claim 5 further com-
prising the step of forming a second B-Tree structure to
store a linear list of additional extent pointers for those
files which have more extent pointers than that which
can be stored in each file, said linear list of additional
extent pointers being stored in terminal nodes of said
second B-Tree structure by having each additional ex-
tent pointer stored in one of said terminal nodes.
* * * * *


